

# NANO SCATTER: Towards Ambient IoT

Fengyuan Zhu<sup>1</sup>, Jiaqi Shen<sup>2</sup>, Wenhui Li<sup>1</sup>, Jianyu Luo<sup>1</sup>, Renjie Zhao<sup>3</sup>, Linling Zhong<sup>1</sup>,  
Bingbing Wang<sup>1</sup>, Xiaohua Tian<sup>1</sup> \*

<sup>1</sup>Shanghai Jiao Tong University, <sup>2</sup>East China Normal University, <sup>3</sup>Johns Hopkins University

## Abstract

Ambient IoT (A-IoT) aims to connect hundreds of billions of ultra-low-power and battery-free devices, and 3GPP has included it in the agenda for 6G standardization. Backscatter communication is considered the mainstream enabling technique for A-IoT; however, the current state of the art can hardly meet A-IoT's main technical requirements simultaneously: power consumption below 100 $\mu$W, communication ranges up to 100 m, and 100+ concurrency. This paper presents NANO SCATTER, the first backscatter network with each tag implemented using our customized backscatter communication ASIC. We propose a nanowatt wake-up receiver design and a sensitivity-driven downlink/uplink modulation mechanism, both realized in the ASIC, which minimize the tag's power consumption while enabling long-range communication. NANO SCATTER supports concurrent communication of 6 IC-based tags with a subcarrier capacity of 512, achieving communication distances of 66 m indoors and 100 m outdoors. The tag consumes 1 $\mu$W in idle listening (58 nW for the core circuit) and 43 $\mu$W during communication.

## CCS Concepts

• Networks → End nodes; • Hardware → Integrated circuits.

### ACM Reference Format:

Fengyuan Zhu<sup>1</sup>, Jiaqi Shen<sup>2</sup>, Wenhui Li<sup>1</sup>, Jianyu Luo<sup>1</sup>, Renjie Zhao<sup>3</sup>, Linling Zhong<sup>1</sup>, Bingbing Wang<sup>1</sup>, Xiaohua Tian<sup>1</sup>. 2025. NANO SCATTER: Towards Ambient IoT. In *The 31st Annual International Conference on Mobile Computing and Networking (ACM MOBICOM '25)*, November 4–8, 2025, Hong Kong, China. ACM, New York, NY, USA, 15 pages. <https://doi.org/10.1145/3680207.3723476>

\*Corresponding author. E-mail: xtian@sjtu.edu.cn.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org. *ACM MOBICOM '25, November 4–8, 2025, Hong Kong, China*

© 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.

ACM ISBN 979-8-4007-1129-9/2025/11  
<https://doi.org/10.1145/3680207.3723476>



Figure 1: NANO SCATTER employs a system-IC co-design approach. (a) Chip tray with 50 customized backscatter dies. (b) Concurrent tags, each built around one die.

## 1 Introduction

Ambient IoT (A-IoT) aims to connect hundreds of billions of ultra-low-power and battery-free devices, and 3GPP has included it in the agenda for 6G standardization [1, 2]. Ultra-low-power here means the capability of maintenance-free operation for years. The 3GPP technical report lists A-IoT application scenarios such as automated warehousing, electronic shelf labels, forest fire monitoring, and smart farms. These applications require the following key performance indicators (KPIs) [2]: (1) The power consumption shall be less than 1 $\mu$W for passive (battery-free) devices and 1 $\mu$W to 100 $\mu$W for semi-passive devices (no battery replenishment) to minimize maintenance cost. (2) The communication range of both downlink and uplink shall be tens to hundreds of meters to cover most indoor and outdoor applications. (3) The network shall support a node density of 1 to 15 devices per square meter, which translates to hundreds to thousands of devices within a cell's coverage area.

Backscatter communication is widely considered a key enabling technique for A-IoT. Over the past decade, numerous efforts have been made to advance backscatter system design [3–14]. Despite these advancements, the current state of the art (SoTA) still leaves performance gaps relative to 3GPP's A-IoT vision:

(1) **Unreliable power estimation:** Without IC integration, it is impossible to evaluate the system performance under a real $\mu$W power constraint. Complete IC integration needs to address circuit-level challenges and incurs half a million dollars in tape-out expenses. The $\mu$W-level communication claimed in the literature is normally obtained by simulating a limited set of function blocks. In particular, existing work only simulates certain uplink components, primarily the modulator logic and RF switches [3, 4, 7, 8, 10, 15–17], while overlooking essential modules such as clock generation or power supply circuits. However, these neglected modules account for a large portion of the power consumption in the backscatter baseband [18, 19].

(2) **Underexplored downlink design:** In A-IoT, the tag spends more time in downlink mode than in uplink mode, even though the volume of uplink data is typically greater than that of the downlink. This is because IoT devices must remain active to listen for network management signals from the gateway. For instance, a forest sensor needs to frequently upload data to the gateway (uplink), but it must continuously listen for gateway instructions to determine when to transmit (downlink). However, existing designs are overly optimistic regarding the downlink receiver's power consumption and sensitivity. Passive envelope detectors consume zero power but generally suffer from poor sensitivity [20–22], limiting downlink communication ranges to around 10 m, which is 1–2 orders of magnitude shorter than A-IoT range requirements and backscatter uplink counterparts. Conversely, recent low-power downlink designs aim to improve receiver sensitivity, but their power consumption reaches hundreds of $\mu\text{W}$, 1–2 orders of magnitude higher than the power required for backscatter communication [23, 24].

(3) **Synchronization challenge for high-concurrency communication:** Existing high-concurrency solutions primarily focus on achieving concurrent uplink backscatter communication, but they underestimate the challenge of achieving high-accuracy synchronization [4, 8, 16, 25]. These high-concurrency designs rely on sample-level accurate synchronization, which can only be accomplished by active receivers consuming hundreds of $\mu\text{W}$ [26]. Such high power consumption could reduce the operational time of a long-lifespan IoT device such as an Apple AirTag from $\sim 1$ year to around 4 months.

To bridge those performance gaps, we present NANO SCATTER, the first backscatter communication network with tags implemented using our customized backscatter communication ASIC, where the tag's standby and communication power levels are nW and  $\mu\text{W}$  respectively. NANO SCATTER supports concurrent backscatter communication, with a capability of 6 tags demonstrated under current experimental conditions, while the subcarrier capacity can accommodate up to 512 tags. Our design approaches are as follows:

(1) **System-on-chip (SoC) integration.** By integrating all essential circuits into a single chip (as shown in Fig. 1), our system-IC co-design approach ensures that no critical circuit is overlooked. This integration, for the first time, allows for comprehensive verification of $\mu\text{W}$-level concurrent backscatter functionality.

(2) **Wake-up first downlink workflow.** We employ a nW-level wake-up receiver that continuously monitors the channel at a low rate. The high-power synchronization circuit is activated only for microseconds just before backscattering, ensuring that the tag's power consumption remains extremely low when it performs idle listening.

**Figure 2: Required sensitivity of tag receiver and backscatter receiver based on path loss analysis [27].**

(3) **Sensitivity-driven long-range design.** The design of the on-tag downlink demodulation and uplink modulation is based on specific sensitivity requirements. As illustrated in Fig. 2, we analyze path loss under the long-range requirement to determine the necessary sensitivity levels for the tag's downlink receivers and the backscatter Rx. The sensitivity of the backscatter Rx further determines the uplink modulation scheme, ensuring that the total noise power in the subcarrier is squeezed below the backscatter power, providing a sufficient signal-to-noise ratio (SNR).

Based on these approaches, we propose the following designs to build an ultra-low-power, high-concurrency, and long-range backscatter communication network that meets A-IoT's main technical requirements:

**Design 1: Nanowatt wake-up receiver for downlink.** The wake-up receiver listens for specific wireless signals in the channel and activates subsequent circuits. By employing a power-efficient circuit structure, reducing active components and minimizing the frequency and voltage of necessary active circuits, we achieve nW-level wake-up (without voltage regulation) while meeting the sensitivity requirement. To mitigate potential false triggers which could lead to unnecessary waste of stored energy, we propose a patterned wake-up approach instead of edge-triggered wake-up. This reduces the false trigger probability by approximately  $3.3 \times 10^4$  times under the optimal threshold, compared to the edge-triggered wake-up used in SoTA [8, 10, 14, 21, 22, 28, 29].

**Design 2: Fine-grained synchronization protocol and synchronization receiver to counteract random wake-up delays.** We design a synchronization receiver with a high sampling rate of 1 MHz without sacrificing sensitivity. The protocol ensures that the receiver is only temporarily activated to conserve power. By modeling the wake-up delay from the nanowatt wake-up receiver in Design 1, we observe a significant random distribution in wake-up times, which poses challenges for successful synchronization within a short time window. To address this, we develop a multiple-chance synchronization protocol that can tolerate a maximum random delay of 2.8 ms for any tag in the network, fully covering the range of random wake-up delays. With this protocol, our synchronization receiver achieves $1\text{ }\mu\text{s}$ synchronization accuracy while reducing synchronization power by 54% using an early-exit mechanism.

**Design 3: Narrow-band OFDMA uplink for long-range concurrent transmissions.** To achieve both long-range and concurrent backscatter communication, we adopt narrow-band OFDMA, which reduces in-band noise power through a narrow spectrum slice. We design a baseband that provides modulation programmability and select a 312.5 kHz bandwidth to ensure that the in-band noise of each subcarrier is minimized. Additionally, we set the subcarrier length to 512 samples, which significantly reduces the noise power in each tag's subcarrier, providing an SNR of over 30 dB in the worst-case deployment. Ultimately, our narrow-band design accomplishes high concurrency and extended range at the same time. Coupled with a high-sensitivity wake-up receiver, this uplink design allows our tags to communicate from any location between the transmitter and receiver, even when these are separated by distances of up to 100 m.
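The noise argument behind the narrow-band choice can be illustrated with a short calculation. The sketch below is not the authors' link budget: it assumes the standard thermal noise density of about -174 dBm/Hz, ignores the receiver noise figure, and assumes that a 512-sample symbol over the 312.5 kHz band yields frequency bins of width 312.5 kHz / 512. All names are illustrative.

```python
import math

# Textbook thermal noise density near 290 K (an assumption, not a value from the paper).
THERMAL_NOISE_DBM_PER_HZ = -174.0


def noise_floor_dbm(bandwidth_hz, noise_figure_db=0.0):
    """Thermal noise power (dBm) integrated over bandwidth_hz."""
    return THERMAL_NOISE_DBM_PER_HZ + 10.0 * math.log10(bandwidth_hz) + noise_figure_db


BANDWIDTH_HZ = 312.5e3     # bandwidth quoted in the text
SUBCARRIER_LENGTH = 512    # samples per subcarrier symbol quoted in the text

# Assumed bin width: one FFT bin of a 512-sample symbol.
subcarrier_bw_hz = BANDWIDTH_HZ / SUBCARRIER_LENGTH
full_band_dbm = noise_floor_dbm(BANDWIDTH_HZ)
per_subcarrier_dbm = noise_floor_dbm(subcarrier_bw_hz)
noise_reduction_db = full_band_dbm - per_subcarrier_dbm

print(f"bin width:            {subcarrier_bw_hz:.1f} Hz")
print(f"noise over full band: {full_band_dbm:.1f} dBm")
print(f"noise per subcarrier: {per_subcarrier_dbm:.1f} dBm")
print(f"reduction:            {noise_reduction_db:.1f} dB")
```

Under these assumptions the per-subcarrier noise drops by 10·log10(512), roughly 27 dB, relative to the full band; a real receiver adds its noise figure on top of both values.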

As shown in Fig. 1(b), we built NANO SCATTER tags using a taped-out integrated circuit along with the necessary peripheral passive components. The IC measures  $2\text{ mm} \times 2\text{ mm}$ , and contains all active function blocks, including the wake-up receiver, synchronization receiver, backscatter circuits, and clock and power generation circuits. The basestation Tx and Rx were developed using USRP, and the system was evaluated to achieve the following results:

- The NANO SCATTER tag consumes $0.058\text{ }\mu\text{W}$ without a low-dropout regulator (LDO) and $1\text{ }\mu\text{W}$ with an LDO, achieving a sensitivity of $-51\text{ dBm}$ in idle listening mode. In synchronization mode, it consumes $61.4\text{ }\mu\text{W}$ at $-54\text{ dBm}$ sensitivity, and $43\text{ }\mu\text{W}$ during backscatter communication. The idle listening power is reduced by $50\times$ compared to the SoTA [26].
- NANO SCATTER tags can achieve sub- $1\text{ }\mu\text{s}$  sample-level synchronization which enables high concurrency backscatter communication, with experimental support for 6-tag concurrency and a subcarrier capacity of 512.
- NANO SCATTER tags can communicate at any location between the Tx and Rx, with Tx-Rx distances of 66 m and 100 m indoors and outdoors, respectively.
- The tag can also operate without batteries, thanks to the integrated RF energy harvesting circuits. Instead of using an expensive and bulky supercapacitor, the energy storage medium is a  $10\text{ }\mu\text{F}$ -level MLCC capacitor.

## 2 System Overview

As shown in Fig. 3, our system operates in four main stages:

- (1) **Idle Stage:** Both the transmitter (Tx) and tags remain in idle mode. The tags continuously listen at a power consumption level of only tens of nanowatts.
- (2) **Wake-up Stage:** The Tx transmits a wake-up signal to the tags. Each tag demodulates this signal, maintaining the same low power consumption as in the idle stage, and activates its main circuits upon successful demodulation of the wake-up signal.
- (3) **Synchronization Stage:** After transmitting the wake-up signal, the Tx sends a fine-grained synchronization signal. In this stage, the tags, with their main circuits already activated, perform precise synchronization with the transmitted signal.
- (4) **Data Transmission Stage:** The Tx sends a single-tone carrier signal, allowing the tags to perform concurrent backscatter for data transmission. Meanwhile, the receiver (Rx) performs parallel demodulation.

Figure 3: System overview.

Figure 4: Timing diagram of communication.

The timing diagram of the communication process is illustrated in Fig. 4. Prior to and during the transmission of the wake-up signal by the Tx, both tag A and tag B perform idle listening at  $1\text{ }\mu\text{W}$  power consumption. Upon receiving the wake-up signal, tags A and B initiate the power-on procedures for the synchronization and backscatter circuits, resulting in random wake-up delays. The Tx then sends a synchronization frame comprising multiple subframes, allowing tags A and B to achieve accurate synchronization despite their different wake-up delays. Finally, the Tx sends the single-tone carrier signal, during which tags A and B perform concurrent backscatter using subcarrier frequencies of  $\Delta f_A$  and  $\Delta f_B$ , respectively.

## 3 NANO SCATTER System Design

In this section, we explain the detailed design of NANO SCATTER, covering the nW-level wake-up receiver, fine-grained synchronization, and the narrow-band OFDMA uplink for long-range concurrent transmissions.

### 3.1 Nanowatt wake-up receiver

Since idle listening is the default state for all NANO SCATTER tags and significantly impacts overall power consumption, the primary objective of the wake-up receiver design is to meet the power budget for semi-passive or even fully passive A-IoT devices. Additional goals for the wake-up receiver include minimizing false alarms to avoid unnecessary energy consumption and ensuring reliable wake-up functionality to enhance the success rate of backscatter communication, even in the presence of interference.

**3.1.1 Meeting the power budget.** To prioritize power efficiency, we design a near-zero power wake-up receiver based on two guiding principles: (1) Avoiding the use of active components wherever possible, and (2) Minimizing the power consumption of any necessary active components. In line with these principles, we select OOK modulation for the wake-up downlink.

The basic architecture of an OOK wake-up receiver comprises the following components: a matching network, rectifier, comparator, oscillator, and digital correlator. To meet the power budget, we implement the following design choices: (1) Employ a passive rectifier that operates without bias or active amplification. (2) Minimize the clock frequency to the lowest feasible level, as the power consumption of active circuits is predominantly dynamic and scales linearly with the oscillator's clock frequency. The dynamic power dissipation follows  $P = \sum_i \frac{1}{2} C_i U_i^2 \cdot f$ , where  $i$  represents the  $i^{\text{th}}$  transistor, and  $f$  is the clock frequency. (3) Reduce the supply voltage of all active components to 0.4 V, which is significantly lower than the standard operating voltage of 1.1 V for the IC process we use. We explain the detailed IC implementation of each module in Sec. 4.
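The effect of design choices (2) and (3) follows directly from the dynamic power expression. The sketch below evaluates $P = \sum_i \frac{1}{2} C_i U_i^2 \cdot f$ for a shared supply voltage; the capacitance values and the 250 Hz clock are hypothetical placeholders chosen only to make the scaling visible, not values from the chip.

```python
def dynamic_power(caps_farads, voltage_volts, clock_hz):
    """Dynamic power P = sum_i 0.5 * C_i * U^2 * f for a shared supply voltage U."""
    return sum(0.5 * c * voltage_volts ** 2 * clock_hz for c in caps_farads)


# Hypothetical switched capacitances (1 fF each); real values depend on the layout.
caps = [1e-15] * 1000
CLOCK_HZ = 250.0  # illustrative low-rate clock, not a value taken from the paper

p_nominal = dynamic_power(caps, 1.1, CLOCK_HZ)  # standard 1.1 V supply
p_low = dynamic_power(caps, 0.4, CLOCK_HZ)      # reduced 0.4 V supply

voltage_saving = p_nominal / p_low              # equals (1.1 / 0.4) ** 2
clock_scaling = dynamic_power(caps, 0.4, 2 * CLOCK_HZ) / p_low  # linear in f

print(f"saving from 1.1 V -> 0.4 V: {voltage_saving:.2f}x")
print(f"power ratio when the clock doubles: {clock_scaling:.2f}x")
```

Whatever the actual capacitances are, lowering the supply from 1.1 V to 0.4 V reduces dynamic power by a factor of about 7.6, and halving the clock frequency halves it again.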

**3.1.2 Reducing false alarms.** The simplified wake-up receiver performing OOK demodulation is prone to false positive wake-ups triggered by noise or interference. False positive wake-ups (false alarms) are particularly detrimental, as they will trigger the following high-power stages and waste precious energy. Furthermore, when a tag is falsely awakened by noise while others respond to the true wake-up signal, it can cause leakage into other subcarriers due to misaligned symbol timing, resulting in bit error rate (BER) degradation.

To mitigate false alarms, we implement a 16-bit wake-up pattern rather than relying on a simple rising-edge-based wake-up scheme. This pattern-based approach significantly reduces the likelihood of false alarms, with the probability decreasing exponentially as the pattern length increases. Suppose the wake-up receiver is monitoring an idle channel filled with noise; the comparator output can then be modeled as a Bernoulli process generating binary outputs. Based on this model, for a wake-up pattern of length $N$ bits and a comparator output probability $p$ for bit 1 (with probability $1-p$ for bit 0), the number of comparison clock cycles until a false alarm occurs follows a geometric distribution. The false alarm probability in each cycle is given by:

$$P_{FA}(N) = p^a \cdot (1-p)^{N-a}, \quad (1)$$

where $a$ represents the number of bit 1s in the stored local pattern. The expected value of the false alarm period can then be calculated as $T_{FA} = \frac{T_b}{P_{FA}}$, where $T_b$ is the bit duration.

Figure 5: Expectations of false alarm period. (a) Different wake-up pattern. (b) Different bit 1 counts.

We then derive the false alarm period by varying the number of bits  $N$  and bit 1 counts  $a$ , as shown in Fig. 5. With  $T_b$  set to 4 ms, Fig. 5(a) demonstrates that traditional rising-edge wake-up (2 bits) has a false alarm period of 16 ms, whereas a 16-bit pattern increases the period exponentially to 250 000 ms. Fig. 5(b) shows the performance of the 16-bit wake-up pattern under varying bit 1 counts, confirming that  $p = 0.5$  provides consistent performance across different bit 1 distributions. We select a 16-bit wake-up pattern and incorporate a dynamic threshold circuit to maintain a comparator threshold ensuring  $p \approx 0.5$ , and the circuit details are elaborated in Sec. 4.1.
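The false alarm model of Eq. (1) is simple enough to evaluate directly. The following sketch computes $P_{FA}$ and $T_{FA}$ for a few pattern lengths and bit 1 counts; the function names are illustrative, and the value $p = 0.3$ is a hypothetical off-balance threshold used only for comparison.

```python
def false_alarm_probability(n_bits, ones, p=0.5):
    """Eq. (1): probability that N noise bits match a pattern containing `ones` 1s."""
    return p ** ones * (1.0 - p) ** (n_bits - ones)


def false_alarm_period_ms(n_bits, ones, p=0.5, bit_ms=4.0):
    """Expected false alarm period T_FA = T_b / P_FA, in milliseconds."""
    return bit_ms / false_alarm_probability(n_bits, ones, p)


edge_ms = false_alarm_period_ms(2, 1)       # rising-edge style 2-bit pattern
pattern_ms = false_alarm_period_ms(16, 8)   # 16-bit pattern at p = 0.5

print(f"2-bit pattern:  {edge_ms:.0f} ms")
print(f"16-bit pattern: {pattern_ms:.0f} ms")

# With p = 0.5 the period does not depend on how many 1s the pattern holds.
balanced = {false_alarm_period_ms(16, a) for a in (4, 8, 12)}
# With a hypothetical skewed threshold (p = 0.3) it does.
skewed = [false_alarm_period_ms(16, a, p=0.3) for a in (4, 8, 12)]
print(balanced, skewed)
```

With $T_b = 4$ ms and $p = 0.5$ the model gives 16 ms for the 2-bit case and $4 \cdot 2^{16} = 262\,144$ ms for the 16-bit case, the same order as the roughly 250 000 ms quoted above. The skewed case shows why the threshold is steered toward $p \approx 0.5$: away from it, the period varies by orders of magnitude with the bit 1 count.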

We note that the overhead of pattern-based detection arises solely from the duration of the pattern, as the computation process introduces no additional latency. The pattern detector is realized by a group of shift registers that store the historical output of the comparator in a first-in-first-out (FIFO) manner. Consequently, correlation is performed and completed within each clock cycle, ensuring real-time operation without computational delay.
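A behavioral model of such a shift-register detector is sketched below. It is not the chip's RTL: the 16-bit pattern is hypothetical, and the exact-match comparison is an assumption (a hardware correlator might tolerate a few bit errors).

```python
from collections import deque


class PatternDetector:
    """FIFO shift register compared against a stored pattern on every clock."""

    def __init__(self, pattern):
        self.pattern = tuple(pattern)
        # The register resets to all zeros and always holds the last N comparator bits.
        self.register = deque([0] * len(self.pattern), maxlen=len(self.pattern))

    def clock(self, bit):
        """Shift in one comparator output; return True if the wake-up flag fires."""
        self.register.append(bit)  # the oldest bit drops out automatically
        return tuple(self.register) == self.pattern


# Hypothetical 16-bit wake-up pattern, for illustration only.
PATTERN = [1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1]
stream = [0, 1, 1, 0, 1] + PATTERN + [0, 0]  # noise bits, pattern, trailing bits

detector = PatternDetector(PATTERN)
hits = [i for i, bit in enumerate(stream) if detector.clock(bit)]
print(hits)
```

The flag fires in the same clock cycle in which the last pattern bit is shifted in (index 20 in this stream), which is the sense in which the comparison adds no latency beyond the pattern duration.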

**3.1.3 Warm-up signal for robust performance.** We adopt OOK modulation to reduce the power consumption of the wake-up receiver, but the typical OOK demodulator design with a fixed threshold (i.e., the reference of the comparator) [21] usually has unreliable performance due to the variable strength of the wake-up signal and interference. Therefore, we adopt a dynamic threshold circuit that adjusts the threshold based on the past received signal strength. Unfortunately, the dynamic threshold in low-power circuits can cause negative effects on demodulation. The root cause is that the dynamic threshold is generated by a simple integration unit that relies solely on historical power levels, making it unresponsive to incoming wake-up signals. As illustrated in Fig. 6(a), when the channel is idle, with no transmitted signal, the threshold tends to drop to a low level, leading to a loss of sensitivity at the beginning of the next wake-up.

**Figure 6: (a) The sensitivity loss issue. (b) Effects of the warm-up sequence. (c) Complete wake-up signal.**

To resolve this issue, we design a warm-up sequence to precede the actual wake-up pattern. This warm-up sequence consists of alternating 1/0 bits, which assists the wake-up receiver in adjusting the threshold to an appropriate level. As shown in Fig. 6(b), the 1/0 bits elevate the comparator threshold from an initially low level to a suitable voltage level. Furthermore, we adopted DC-balanced Manchester encoding for the wake-up pattern to maintain the appropriate threshold during the wake-up signal decoding. The complete wake-up signal is depicted in Fig. 6(c), which includes the stabilization 1/0 sequence followed by the Manchester encoded wake-up pattern.
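The structure of the complete wake-up signal can be sketched as follows. The warm-up length, the payload word, and the Manchester convention (1 → 10, 0 → 01) are all assumptions made for illustration; the text does not fix them here.

```python
def manchester_encode(bits):
    """Manchester-encode bits as chip pairs (assumed convention: 1 -> 10, 0 -> 01)."""
    chips = []
    for bit in bits:
        chips.extend((1, 0) if bit else (0, 1))
    return chips


def build_wakeup_signal(pattern_bits, warmup_len=8):
    """Alternating 1/0 warm-up sequence followed by the Manchester-encoded pattern."""
    warmup = [1 - (i % 2) for i in range(warmup_len)]  # 1, 0, 1, 0, ...
    return warmup + manchester_encode(pattern_bits)


PATTERN = [1, 1, 0, 1, 0, 0, 1, 0]  # hypothetical payload word
signal = build_wakeup_signal(PATTERN)
encoded = signal[8:]

# DC balance: exactly half of the encoded chips are 1, regardless of the payload.
duty = sum(encoded) / len(encoded)
print(signal, duty)
```

Because every encoded bit contributes one high and one low chip, the average level seen by the integrating threshold circuit stays at the midpoint throughout the pattern, which is the property the DC-balanced encoding is chosen for.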

It is important to note that the warm-up sequence effectively addresses the limitations of the dynamic threshold in low-interference scenarios, while the dynamic threshold itself ensures reliable performance in high-interference environments.

### 3.2 Fine-grained synchronization protocol

The top priority in the design of the synchronization stage is achieving precise timing at the sample level, which is essential for high-concurrency backscatter. However, using the wake-up receiver for synchronization is impractical due to its extremely low sampling rate, which introduces a millisecond-level offset. This offset can misalign thousands of samples in the backscatter baseband, potentially causing concurrent transmissions to fail. To address this, a dedicated synchronization receiver is necessary to reduce the wake-up offset to the microsecond level, corresponding to less than 10 samples in the backscatter baseband. This offset can be effectively managed by incorporating a small guard interval between adjacent symbols, such as the cyclic prefix (CP) widely used in 4G/5G and OFDM-based Wi-Fi networks. While the receiver requires high accuracy, its power consumption is less critical because it is only activated temporarily.

**3.2.1 Synchronization receiver circuit.** Since the synchronization receiver is activated only temporarily, our design focuses on achieving a high sampling rate without compromising sensitivity. The synchronization receiver (shown in Fig. 7) follows a similar low-power architecture to the wake-up receiver, with two major differences. First, we introduce an LNA to ensure high sensitivity across a higher bandwidth. Second, we adopt a current-biased rectifier with a lower threshold voltage to further enhance sensitivity. The comparator samples the channel at 1 MHz and streams the bits to the digital correlator, which is capable of detecting and decoding 16 predefined synchronization patterns, as detailed in Sec. 3.2.6.

**Figure 7: The synchronization receiver circuit.**

The necessity of using a digital correlator to detect patterns, rather than relying on rising edge detection, stems from two primary drawbacks associated with rising edge detection: (1) False synchronization: the millisecond-level variability in wake-up delay distribution allows both noise and minor interference to trigger a rising edge. (2) Missed synchronization: rising edges occur within microseconds, making them easy to miss if a tag experiences a significant wake-up delay. In the next section, we will analyze the wake-up delay in more detail.

**3.2.2 Wake-up delay components.** The wake-up delay is defined as the interval between the end time of the wake-up signal transmission at the Tx and the start time of synchronization stage at the tag. As shown in Fig. 8, it can be divided into two main components: (1) Delay caused by the wake-up receiver's sampling time offset (STO), referred to as self delay; and (2) Delay caused by the subsequent circuits, primarily the start-up time of the backscatter circuit after the wake-up receiver is activated.

**(1) Self delay.** The tag's wake-up receiver samples the wake-up signal transmitted by the Tx. Ideally, it should immediately sample the envelope waveform and demodulate the bits with time-domain alignment with the Tx's symbol transmission. However, in practice, the sampling moments are determined by the rising edges of the receiver's own sampling clock, which is unsynchronized with the Tx's symbol clock. While this does not affect demodulation or successful wake-up, the wake-up flag is pulled up after a delay relative to the end of the wake-up signal at the Tx side. This delay, caused by the wake-up receiver's STO, is referred to as self delay.

**(2) Power-on delay.** While the wake-up receiver monitors the channel, the synchronization and backscatter circuits are fully powered down to conserve energy. Once the wake-up flag is triggered, these circuits initiate their power-on sequence. During this brief phase, the tag is unable to communicate. It is important to note that power-on delay is an inherent hardware characteristic and cannot be entirely eliminated. In a network with a single backscatter device, this delay can be measured and compensated at the receiver. However, in networks with hundreds or thousands of concurrent tags, variations in power-on delay among different tags can introduce significant timing errors that exceed the receiver's ability to compensate.

Figure 8: Wake-up delay components.

**3.2.3 Need for designing a power-on sequence.** A power-on process is required when the tag transitions from the idle listening state to the synchronization stage after being woken up. This is due to the mechanism of wake-up-based low-power IoT: all high-power circuits are powered off while the wake-up receiver is listening, and powered on when the wake-up receiver detects the wake-up signal.

A straightforward approach to powering on the synchronization circuits is to sequentially activate all circuits after wake-up. However, this method, referred to as the *basic power-on sequence*, introduces unpredictable wake-up delays, which can lead to inefficiencies and errors in the communication process.

The basic power-on sequence proceeds as follows: Once the wake-up flag is asserted, the voltage regulator circuits for the backscatter system are enabled. This event triggers the Power-On Reset (POR), resetting all register states within the backscatter circuit. Meanwhile, the crystal oscillator for the backscatter radio begins to oscillate, eventually reaching full amplitude and a 50% duty cycle, making it ready to drive the backscatter digital baseband. Finally, the power supply circuit of an active OOK receiver is activated, allowing for sample-level synchronization and subsequent backscatter communication.

The total delay  $t_{\text{total, basic}}$  of the basic power-on process is expressed as:

$$t_{\text{total, basic}} = t_{\text{wake}} + t_{\text{power}} + t_{\text{por}} + t_{\text{xtal}} \quad (2)$$

where  $t_{\text{wake}}$  represents the wake-up receiver's self-delay caused by the Sampling Time Offset (STO),  $t_{\text{power}}$  denotes the time between the power switch-on event and the voltage regulation event,  $t_{\text{por}}$  is the duration required for the POR process, and  $t_{\text{xtal}}$  is the start-up time of the crystal oscillator. Together,  $t_{\text{power}}$ ,  $t_{\text{por}}$ , and  $t_{\text{xtal}}$  constitute the power-on delay.



Figure 9: Power-on sequence design.

Directly constructing a probability model for the power-on delay in the process above is challenging due to variations between tags. Simulations using Monte Carlo methods within IC design software are also prohibitively time-consuming, often taking weeks to produce a single result, making them impractical for distribution analysis. In the subsequent sections, we introduce an optimized power-on sequence that simplifies the acquisition of the chip's delay distribution.

**3.2.4 Power-on sequence design.** The purpose of the power-on sequence design is to minimize the uncertainty in wake-up delays while reducing the active time of the LNA to conserve power. This design is orchestrated by a scheduler, which organizes the entire process and extends the power-on completion to a more predictable moment.

As illustrated in Fig. 9, the scheduler postpones the LNA power-on event by 6 ms, making it the final step in the power-on sequence. During this 6 ms period, other circuits, primarily the XCO and digital correlator, proceed through the standard power-on procedure in parallel. The standard power-on procedure, commonly adopted, involves the LDO powering up, which then triggers the Power-On Reset (POR), after which the circuit becomes operational. For the XCO, an additional start-up process is required before it is fully ready due to the inherent characteristics of the crystal.

The delay duration of 6 ms is determined based on simulation results conducted under SS and FF process corners and extreme voltage and temperature conditions. These simulation results suggest a boundary around 5 ms, leading to the adoption of a 6 ms delay to align with the 2 kHz scheduling clock.

**Benefit:** This design simplifies the prediction of backscatter timing by decoupling  $t_{\text{total}}$  from the complex start-up processes of the subsystems. An updated delay analysis will be provided in the subsequent section.

**Expense:** The improved power-on process incurs a higher communication overhead. However, this overhead is acceptable for narrow-band communication. For instance, the typical frame length in NANO SCATTER is 326 ms, and the 6 ms delay introduces an additional overhead of only 1.8%.
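The arithmetic behind the scheduled delay and its overhead is easy to check. The sketch below uses only the figures quoted in the text (2 kHz scheduling clock, 6 ms delay, 326 ms frame); the variable names are illustrative.

```python
SCHED_CLOCK_HZ = 2_000   # scheduling clock from the text
DELAY_MS = 6.0           # scheduled LNA power-on delay
FRAME_MS = 326.0         # typical frame length

# Number of scheduler clock periods that make up the 6 ms delay.
ticks = round(DELAY_MS * 1e-3 * SCHED_CLOCK_HZ)

# Overhead of the delay relative to the frame length.
overhead_percent = 100.0 * DELAY_MS / FRAME_MS

print(f"scheduler ticks: {ticks}")
print(f"overhead: {overhead_percent:.1f}%")
```

The delay corresponds to 12 periods of the 500 µs scheduling clock, and 6 ms out of 326 ms is about 1.8%.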

**3.2.5 Wake-up delay model.** Based on the improved power-on sequence, the wake-up delay distribution can now be acquired. Given that the scheduler now governs the conclusion of the wake-up process, the total wake-up delay is influenced by both the counter start time delay and the counter end time delay. Thus, the improved total delay $t_{\text{total\_improved}}$ can be expressed as:

$$t_{\text{total\_improved}} = t_{\text{wake}} + t_{\text{schedule\_6ms}} + t_{\text{lna\_power}}, \quad (3)$$

where $t_{\text{wake}}$ is consistent with the delay in the basic power-on process, $t_{\text{schedule\_6ms}}$ represents the scheduled 6 ms delay, and $t_{\text{lna\_power}}$ denotes the delay associated with the LNA power-on process.

Figure 10: Estimated distribution of the wake-up delay.

Figure 11: The generation of combined Barker code.

In this model,  $t_{\text{wake}}$  remains a tag-unspecific delay, while  $t_{\text{schedule\_6ms}}$  and  $t_{\text{lna\_power}}$  are tag-specific delays. The  $t_{\text{schedule\_6ms}}$  delay, which is centered around 6 ms, is a random, tag-specific delay influenced by clock jitter. Meanwhile,  $t_{\text{lna\_power}}$  is a random, tag-specific delay typically in the microsecond range. Both  $t_{\text{schedule\_6ms}}$  and  $t_{\text{lna\_power}}$  can be modeled as normal distributions. Here the term ‘tag-specific delay’ means that the delay varies due to different tags’ transistor-level variations in the fabrication process.

Given the prior knowledge of circuit behavior, we observe that  $t_{\text{lna\_power}} \ll t_{\text{schedule\_6ms}}$ , allowing us to omit  $t_{\text{lna\_power}}$  from further analysis. Simulations of the 2 kHz relaxation oscillator’s period indicate that it follows a normal distribution  $N(500 \mu\text{s}, 17 \mu\text{s})$ . Since  $t_{\text{schedule\_6ms}}$  comprises 12 periods, its distribution becomes  $N(6 \text{ ms}, 59 \mu\text{s})$ .
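The step from the single-period distribution to the 12-period distribution can be reproduced in a few lines. The sketch assumes that the second parameter of $N(\cdot,\cdot)$ is a standard deviation and that the 12 oscillator periods are independent, so that variances add.

```python
import math


def sum_of_periods(mean_us, sigma_us, n_periods):
    """Mean and standard deviation of the sum of n independent normal periods."""
    return n_periods * mean_us, math.sqrt(n_periods) * sigma_us


# One period of the 2 kHz relaxation oscillator: N(500 us, 17 us).
mean_us, sigma_us = sum_of_periods(500.0, 17.0, 12)

print(f"t_schedule_6ms ~ N({mean_us / 1000:.0f} ms, {sigma_us:.0f} us)")
print(f"3-sigma spread: +/- {3 * sigma_us:.0f} us")
```

The result, a mean of 6 ms and a standard deviation of $17\sqrt{12} \approx 59$ µs, matches the distribution stated above. If consecutive periods were correlated, the spread would be larger than this estimate.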

By summing these distributions, we obtain the combined distribution for the total wake-up delay, as illustrated in Fig. 10. The predicted total wake-up delay ranges from approximately 5.7 ms to 7.3 ms.

**3.2.6 Synchronization frame.** There are three requirements for the synchronization frame: (1) The synchronization frame should provide multiple chances for the synchronization receiver that operates with the delay modeled above, which means it must contain multiple synchronization elements; (2) Each synchronization element should have strong autocorrelation performance; (3) Tags that synchronize to different synchronization elements should ultimately be scheduled to backscatter simultaneously, as required by OFDMA.



(a) 11-bit Barker code.

(b) Combined Barker code.

Figure 12: Auto correlation comparison between ideal Barker code and proposed combined code.

To meet requirements (1) and (3), the synchronization frame contains 16 subframes, distributed over a time span covering the wake-up delay. Each subframe is embedded with a 4-bit ID indicating the scheduled backscatter time, and also contains a synchronization pattern. Each tag decodes this ID in parallel with the pattern correlation process and triggers a countdown counter immediately after successful correlation and ID decoding. This is realized with parallel logic written in Verilog.

We now introduce the bit-level contents of the subframes to meet requirement (2). Our idea is to combine multiple Barker codes to form a longer Barker-like code as the subframe contents. Barker codes are known to have ideal autocorrelation performance. We combine an 11-bit Barker code ‘10110111000’ with a 4-bit ID using overlay encoding. To be specific, each bit of the 4-bit ID is XNORed with all 11 bits of the Barker code, as illustrated in Fig. 11, using subframe ID#3 as an example. Then, we concatenate the 4 resulting groups of 11 bits in sequence to construct the main part of a subframe. Note that this part only lasts for 88  $\mu\text{s}$ , and the remaining part of the subframe (another 88  $\mu\text{s}$ ) is filled with alternating 1/0, using the same warm-up signal as in Sec. 3.1.3. We find that the synthesized longer code still has good autocorrelation performance. Fig. 12a shows the autocorrelation result of the 11-bit Barker code working as an atom in our subframe, and Fig. 12b illustrates the maxima and minima curves over the 16 combinations. The combinations of Barker codes achieve similar performance when  $lags < 11$ .
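The overlay encoding can be sketched in a few lines. The code below builds the 44-bit main part of a subframe from the Barker code given above and evaluates its aperiodic autocorrelation. The MSB-first ordering of the ID bits and the function names are assumptions made for illustration; the text does not specify the bit order used in Fig. 11.

```python
BARKER_11 = "10110111000"  # 11-bit Barker code given in the text


def build_subframe_code(subframe_id):
    """Overlay a 4-bit ID on the Barker code: each ID bit is XNORed with all 11 bits."""
    id_bits = format(subframe_id, "04b")  # MSB-first ordering is an assumption
    out = []
    for id_bit in id_bits:
        for chip in BARKER_11:
            out.append("1" if chip == id_bit else "0")  # XNOR of two bits
    return "".join(out)


def autocorrelation(bits):
    """Aperiodic autocorrelation of a bit string mapped to +1/-1, for lags >= 0."""
    s = [1 if b == "1" else -1 for b in bits]
    n = len(s)
    return [sum(s[i] * s[i + lag] for i in range(n - lag)) for lag in range(n)]


code_id3 = build_subframe_code(3)
print("ID#3 code:", code_id3)

acf_barker = autocorrelation(BARKER_11)
print("Barker peak:", acf_barker[0],
      "max sidelobe magnitude:", max(abs(v) for v in acf_barker[1:]))

acf_id3 = autocorrelation(code_id3)
print("combined peak:", acf_id3[0],
      "max sidelobe magnitude for lags < 11:", max(abs(v) for v in acf_id3[1:11]))
```

An ID bit of 1 leaves the Barker code unchanged and an ID bit of 0 inverts it, so the ID is carried by the sign of each 11-bit group while every group keeps the Barker autocorrelation shape.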

Finally, we place these 16 subframes sequentially from  $ID = 15$  to  $ID = 0$  to form the complete synchronization frame, lasting for 2.8 ms. This leaves some redundancy compared to the wake-up delay model.

**Figure 13: Synchronization frame and the early-exit power saving strategy of the synchronization receiver.**

**3.2.7 Power saving strategy of the synchronization receiver.** Now we present how the synchronization receiver conserves power during the demodulation of synchronization subframes, with the overall goal of minimizing its active time. To achieve this, we design an early exit mechanism. After being awakened and powered on, the synchronization receiver performs demodulation and correlation until either a subframe is detected or the maximum waiting time (the duration of two subframes) is reached. At this point, the power-hungry analog front end, primarily the LNA, is shut down, and a counter is employed to schedule the backscatter stage. The counter’s starting value is determined based on the detected subframe ID:  $T_{\text{count}} = T_{\text{subframe}} \times ID$ . This approach ensures that the counter reaches zero precisely when the synchronization frame ends and the backscatter phase begins. As a result, the high-power-consumption active Rx only operates for a brief period, lasting around 0.25 ms.

This power-saving strategy also supports concurrent synchronization of tags that wake up with different delays. For example, consider tags A and B in Fig. 13. Tag A wakes up early in the wake-up delay distribution and is ready to use the active OOK Rx to synchronize with the subframes. Since the wake-up event occurs during the C15 subframe, tag A misses the C15 subframe because the correlation result does not reach the maximum. It then detects the C14 subframe, obtains the countdown value of 14, and turns off the Rx to save power. With this mechanism, the receiver is only required to be on for at most two subframes. If the receiver remains active for two subframes without successful synchronization, it exits early, achieving an inner-frame duty-cycling effect.
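The countdown rule can be illustrated with a small timing sketch. It assumes a subframe length of 176 $\mu$s (88 $\mu$s of code plus 88 $\mu$s of alternating bits) and that detection completes exactly at the end of the detected subframe, ignoring processing latency. The detected ID used for the second tag is hypothetical, since the text only gives tag A's value.

```python
T_SUBFRAME_US = 176   # 88 us of code plus 88 us of alternating 1/0 per subframe
NUM_SUBFRAMES = 16    # subframes are sent in order ID = 15, 14, ..., 0


def subframe_end_time_us(subframe_id):
    """Time, measured from the start of the frame, at which the given subframe ends."""
    position = (NUM_SUBFRAMES - 1) - subframe_id  # ID 15 is sent first
    return (position + 1) * T_SUBFRAME_US


def backscatter_start_us(detected_id):
    """Detection completes at the subframe end; the counter then runs for T_subframe * ID."""
    countdown = T_SUBFRAME_US * detected_id
    return subframe_end_time_us(detected_id) + countdown


frame_length_us = NUM_SUBFRAMES * T_SUBFRAME_US
# Tag A detects subframe 14 as in the text; the second tag's ID of 9 is hypothetical.
for tag, detected_id in (("A", 14), ("second tag", 9)):
    print(tag, "detects ID", detected_id,
          "-> backscatter starts at", backscatter_start_us(detected_id), "us",
          "(frame length:", frame_length_us, "us)")
```

Whatever subframe a tag locks onto, the elapsed time plus the countdown adds up to the full frame length, which is what lets tags with different wake-up delays start backscattering together.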

Suppose that the synchronization circuit consumes power  $P_{\text{sync\_on}}$  for a duration  $T_{\text{sync\_on}}$  when turned on and  $P_{\text{sync\_off}}$  for a duration  $T_{\text{sync\_off}}$  when turned off. The average power is then:

$$P_{\text{average}} = \frac{P_{\text{sync\_on}} \times T_{\text{sync\_on}} + P_{\text{sync\_off}} \times T_{\text{sync\_off}}}{T_{\text{sync\_on}} + T_{\text{sync\_off}}}. \quad (4)$$

We can derive that our strategy saves power by around 54% based on the measured data in Sec. 5.2.

### 3.3 Backscatter design

Once high-accuracy synchronization is achieved, concurrent backscatter communication can be realized. One of the key objectives of NANO SCATTER is to enable long-range operation at the hundred-meter level, which must also be supported in the backscatter uplink. To meet this challenge, our design principle focuses on employing *narrow-band* OFDMA to achieve long-range, concurrent backscatter communication.

The content is structured as follows: first, we introduce the design of our baseband, which maintains programmability even after ASIC implementation; second, we discuss the narrow-band design and demonstrate how this approach meets the link budget, facilitating long-range operations; finally, we present the complete backscatter process, considering both the transmitter and receiver aspects.

**3.3.1 Baseband design.** As shown in Fig. 14, the backscatter baseband consists of a generic phase modulator, RAM, crystal oscillator (XCO), and an RF switch-based impedance network. The generic phase modulator is the core component of the baseband, responsible for frequency shifting and data modulation. The RAM stores the modulator’s physical parameters as well as the payload bits, while the XCO provides an accurate local clock for the modulator.

The generic phase modulator is inspired by the advanced modulator presented in [30], supporting configurable second-order phase modulation. We opt for this flexible framework instead of fixing the physical layer parameters because an ASIC implementation freezes the digital design, making it unmodifiable post-implementation; a configurable modulator mitigates this inflexibility and allows further changes to the tag. After being triggered by the synchronization receiver, the modulator first configures itself using parameters stored in the RAM. It then converts the input 10 MHz XCO clock to the desired sampling frequency and shift frequency using programmable frequency synthesizers. Finally, it performs backscatter modulation based on the configured parameters.

Next, we discuss how this design enables NANO SCATTER tags to perform OFDMA concurrent transmission. Two main parameters of the generic phase modulator are configured: subcarrier phase and subcarrier frequency. The modulator allows each tag to perform frequency shifting with a resolution of 4.7 Hz and a range of 0 MHz to 10 MHz. This flexibility allows each tag to be configured to a target subcarrier frequency by deriving the frequency code for the modulator. Suppose the total bandwidth is  $BW$ ; then each tag can be assigned a subcarrier  $f_i = \frac{i}{N} \times BW$ , where  $N$  is the total number of subcarriers. Additionally, each tag’s subcarrier is shifted 2 MHz away from the single-tone excitation signal frequency. This shift is achieved by adding the 2 MHz NCO phase to the subcarrier’s phase. Each tag performs BPSK modulation, where the phase is loaded at the initial sample of the symbol, carrying the bit information.

Finally, the generic phase modulator outputs the phase  $\text{phase}_i(t)$  that satisfies the following expression:

$$\text{phase}_i(t) = \begin{cases} 2\pi \times (f_{2\text{MHz}} + f_i)t + 0, & \text{when bit} = 1 \\ 2\pi \times (f_{2\text{MHz}} + f_i)t + \pi, & \text{when bit} = 0 \end{cases} \quad (5)$$

Figure 14: Backscatter baseband circuit design.

Figure 15: SNR under different bandwidth and symbol length settings.

The phase signal is then converted into a 1-bit control signal for switching the antenna load in the backscatter tag’s impedance network. As illustrated in Fig. 14,  $Z_{L1}$  is chosen when  $0 \leq \text{phase}_i(t) < \pi$ , and  $Z_{L2}$  is chosen when  $\pi \leq \text{phase}_i(t) < 2\pi$ .
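The mapping from the modulator phase to the 1-bit load control can be sketched as follows. The sketch uses the 2 MHz shift stated in the text and the subcarrier rule $f_i = \frac{i}{N} \times BW$; treating the 10 MHz XCO clock as the sample rate of the control signal, the choice of subcarrier index 8, and all function names are assumptions made for illustration.

```python
import math

F_CLK_HZ = 10_000_000      # XCO clock from the text, used here as the sample rate (assumption)
F_SHIFT_HZ = 2_000_000     # frequency shift away from the excitation tone
BW_HZ = 312_500            # total bandwidth selected in Sec. 3.3.2
N_SUBCARRIERS = 512


def subcarrier_frequency(i):
    """f_i = i / N * BW."""
    return i / N_SUBCARRIERS * BW_HZ


def phase(i, bit, t):
    """Phase of Eq. (5) for tag subcarrier i, wrapped into [0, 2*pi)."""
    offset = 0.0 if bit == 1 else math.pi
    total = 2 * math.pi * (F_SHIFT_HZ + subcarrier_frequency(i)) * t + offset
    return total % (2 * math.pi)


def switch_control(i, bit, num_samples):
    """1-bit control stream: 0 selects Z_L1 (phase in [0, pi)), 1 selects Z_L2."""
    return [0 if phase(i, bit, n / F_CLK_HZ) < math.pi else 1
            for n in range(num_samples)]


ones = switch_control(i=8, bit=1, num_samples=20)
zeros = switch_control(i=8, bit=0, num_samples=20)
print("bit = 1:", ones)
print("bit = 0:", zeros)
```

Because the two BPSK symbols differ by a phase offset of $\pi$, the two control streams are complements of each other, so a bit flip simply swaps which load is selected at each sample.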

**3.3.2 Meeting link budget.** After designing the baseband for backscatter modulation, we now determine the modulator parameters to meet the link budget. Our key insight is that by using a narrow-band design, we can achieve long-range backscatter communication.

We define the bandwidth of each tag as  $BW/N$ , where  $N$  is the total number of subcarriers. The noise in the  $i^{\text{th}}$  tag's subchannel can be continuously reduced by decreasing the total bandwidth  $BW$ :

$$P_N(BW) = \int_{f_i - \frac{BW}{2N}}^{f_i + \frac{BW}{2N}} PSD(f) \cdot df \cdot NF \approx kT \cdot \frac{BW}{N} \cdot NF, \quad (6)$$

where  $PSD$  is the power spectral density of thermal noise in the channel,  $k$  is the Boltzmann constant,  $T$  is the temperature in Kelvin, and  $NF$  is the noise factor of the receiver, which is typically 3 dB. This equation shows that reducing the bandwidth decreases the noise power, thereby increasing the SNR even when the signal power is very weak.

Figure 16: NANO SCATTER tag.

In Fig. 15, we present the backscatter SNR as the bandwidth ( $BW$ ) increases from 312.5 kHz to 10 MHz across different symbol lengths  $N$ . The transmission power of the tag,  $P_{\text{tag}}$ , is calculated using the Friis equation [20, 27], and the SNR is given by  $\text{SNR} = \frac{P_{\text{tag}}}{P_N}$ . The value of  $P_{\text{tag}}$  used in the calculations is an empirically determined  $-110$  dBm, reflecting the worst-case scenario, where the tag is positioned at the midpoint between the transmitter and receiver. The results show that when  $BW$  is reduced to 312.5 kHz, the backscatter tag can successfully communicate at the midpoint when the transmitter and receiver are 100 m apart, achieving a theoretical SNR exceeding 30 dB. Based on these findings, we select 312.5 kHz as the optimal bandwidth and 512 as  $N$  to maximize the SNR while achieving a sufficient data rate. To mitigate residual synchronization errors, a cyclic prefix (CP) of 16 samples is added to each symbol.
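The link budget argument can be reproduced with a short calculation based on Eq. (6). The sketch below assumes a reference temperature of 290 K, which the text does not specify, and applies the 3 dB noise figure and the $-110$ dBm tag power quoted above; the function names are illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
TEMPERATURE_K = 290.0        # reference temperature; an assumption, not given in the text


def noise_power_dbm(bw_hz, n_subcarriers, nf_db=3.0):
    """Thermal noise in one subchannel per Eq. (6): k * T * (BW / N) * NF, in dBm."""
    noise_w = K_BOLTZMANN * TEMPERATURE_K * (bw_hz / n_subcarriers)
    return 10 * math.log10(noise_w / 1e-3) + nf_db


def snr_db(p_tag_dbm, bw_hz, n_subcarriers):
    """SNR in dB for a given received backscatter power."""
    return p_tag_dbm - noise_power_dbm(bw_hz, n_subcarriers)


for bw in (312_500, 1_250_000, 10_000_000):
    print(f"BW = {bw / 1e3:8.1f} kHz  SNR = {snr_db(-110.0, bw, 512):5.1f} dB")
```

With these inputs, the 312.5 kHz setting yields an SNR above 30 dB, in line with the claim above, and each doubling of the bandwidth costs about 3 dB.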

## 4 Hardware and IC design

In NANO SCATTER, the basestation Tx is implemented with a USRP B210 device. The Tx device transmits the wake-up signal, synchronization frame, and single tone at 433 MHz. The transmitting power level is elevated to 30 dBm by adding a customized power amplifier (PA) to the USRP. The receiver is another USRP B210 device operating at 435 MHz, aligned with the 2 MHz frequency shift performed by the tags.

As illustrated in Fig. 16, each NANO SCATTER tag is built upon a custom integrated circuit fabricated using the SMIC 40 nm ultra-low-power process and mounted on a custom-designed printed circuit board (PCB). The digital circuits of the tag are synthesized using Synopsys DC, while the analog circuits are designed in Cadence Virtuoso. The die is directly mounted onto the customized PCB using wire bonding. The PCB itself is entirely passive: it contains no active components and operates independently without the need for an external daughterboard. We introduce the detailed design of the circuits below.

Unless otherwise specified, the antenna used on the Tx has a measured gain of 2 dBi, while the antennas on the tags and Rx have a measured gain of  $-3$  dBi.

### 4.1 Wake-up receiver

The entire wake-up receiver operates at 0.4 V with a clock frequency of 2 kHz. The circuit is illustrated in Fig. 17(a), and its implementation details are as follows.

**Rectifier.** We utilize a transistor-based pseudo-balun rectifier. The gate and source terminals of the transistor are merged, forming a unified terminal. Consequently, the transistor functions as a diode, with the anode being the drain terminal and the cathode being the unified gate-source terminal. To enhance sensitivity, the rectifier adopts a differential architecture, with each side employing a 2-stage Dickson rectifier structure.

Figure 17: (a) Wake-up receiver circuit. (b) Dynamic threshold circuit of the wake-up receiver. (c) XCO of the backscatter circuit.

**Clock-driven comparator.** Following the rectifier, a clock-driven comparator is employed to demodulate the OOK signal by comparing the real-time analog envelope signal with the dynamic threshold. To conserve power, the comparator uses a tiny capacitor to sample the voltage from the rectifier at the clock's rising edges, performing voltage comparisons at the clock’s falling edges. The generation of the dynamic threshold is described below.

**Dynamic threshold.** The comparator’s dynamic threshold is controlled by a digital integrator as illustrated in Fig. 17(b). Each time the integrator reads the comparator output  $D_{in}$ , it adjusts the threshold level by incrementing or decrementing the value stored in its register based on  $D_{in}$ . This register value is then converted to an analog voltage at the negative input port of the comparator by using a nW capacitive digital-to-analog converter (CDAC). This feedback mechanism ensures that the threshold adapts dynamically to variations in the input signal, maintaining near-optimal detection sensitivity.
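A behavioral sketch of this feedback loop is given below. It is not the circuit itself: the 8-bit register width, the unit step size, and the polarity (step up when the envelope exceeds the threshold) are assumptions made for illustration, and the envelope is expressed directly in DAC code units.

```python
def track_threshold(envelope_samples, start_code=0, max_code=255):
    """Digital integrator: step the threshold code up when the comparator
    output D_in is 1 (envelope above threshold) and down when it is 0."""
    code = start_code
    history = []
    for sample in envelope_samples:
        d_in = 1 if sample > code else 0  # comparator decision
        code = min(max_code, code + 1) if d_in else max(0, code - 1)
        history.append((d_in, code))
    return history


# Constant envelope at level 40: the threshold ramps up, then dithers around that level.
trace = track_threshold([40] * 60)
print([code for _, code in trace[-6:]])
```

Once the register reaches the input level, the comparator output alternates and the threshold toggles between adjacent codes, which is the steady-state behavior expected of a one-step integrator.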

**Digital correlator.** The digital correlator calculates the correlation between the 1-bit comparator output sequence  $\{X_{in}[n]\}$  and a stored local sequence  $\{X_0[n]\}$ , where  $n \in \{0, 1, 2, \dots, N-1\}$  and  $N$  is the correlation length. If, at a moment  $t = T_0$ , the input sequence matches the stored local sequence, i.e.,  $\forall i \in \{0, 1, 2, \dots, N-1\}, X_{in}[i] = X_0[i]$ , the correlator outputs a logic ‘1’ as a wake-up flag. In our hardware design, the value  $N$  is set to 16.
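The exact-match rule described above behaves like a 16-bit shift register compared against a stored word on every clock. The sketch below models that behavior in software; the 16-bit pattern is hypothetical, since the actual wake-up sequence is not given in the text.

```python
from collections import deque

CORRELATION_LENGTH = 16  # N in the text


def wake_up_flags(bit_stream, local_sequence):
    """Yield 1 when the last N received bits equal the stored sequence, else 0."""
    window = deque(maxlen=len(local_sequence))
    target = list(local_sequence)
    for bit in bit_stream:
        window.append(bit)  # oldest bit drops out once the window is full
        yield 1 if list(window) == target else 0


# Hypothetical 16-bit wake-up pattern; the real sequence is not specified in the text.
pattern = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 0]
assert len(pattern) == CORRELATION_LENGTH

stream = [0, 1, 0] + pattern + [1, 1]
flags = list(wake_up_flags(stream, pattern))
print("wake-up flag raised at sample index", flags.index(1))
```

The flag is raised on the clock cycle at which the final pattern bit arrives, and only then, because a single mismatched bit anywhere in the window keeps the output at 0.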

**Relaxation Oscillator.** The relaxation oscillator generates the clock signal for both the digital correlator and the comparator, operating at a frequency of 2 kHz. This low frequency helps achieve power consumption in the nW range. The oscillation period is determined by the capacitance and resistance within the circuit. Compared to a ring oscillator, another common low-power oscillator, the relaxation oscillator is more power-efficient at kHz-level frequencies.

**Low-voltage operation.** The wake-up receiver operates at a reduced voltage of approximately 0.4 V, within the sub-threshold region. In this region, transistors conduct only partially, significantly lowering the total power consumption compared to standard voltage conditions.

### 4.2 Backscatter circuit

In addition to the digital baseband introduced in Sec. 3.3, the backscatter circuit includes the following components:

**Voltage regulators.** These regulators obtain power from an external energy source (such as a capacitor for energy storage) and provide a constant output voltage. The regulators used include 1.1 V LDO circuits for the backscatter circuits, and 0.4 V LDO circuits for the tiny RAM.

**Reference circuits.** These circuits generate a reference voltage source and a reference current source that are independent of the chip’s supply voltage, temperature, and process variations.

**Low-power register.** This register stores all programmable parameters of the circuit, which can be accessed and modified by SPI ports. The programmable parameters include the analog circuits’ configuration bits, backscatter baseband parameters, and backscatter payload data.

**XCO.** The crystal oscillator comprises an off-chip 10 MHz crystal and an on-chip crystal driver, as shown in Fig. 17(c). To expedite the oscillator’s start-up process, we use a ring oscillator operating at the same frequency to provide an initial injection signal, which is turned off once the start-up is complete.

### 4.3 Battery-free operation

NANO SCATTER realizes battery-free operation through RF energy harvesting. The energy harvesting circuit consists of a rectifier for AC-DC conversion, a charge pump circuit that increases the voltage output, and a power management unit to control the charge and discharge states. The rectifier is external to the chip and is made of an SMS7630 Schottky diode.

The harvested energy is stored on an off-chip capacitor. Unlike FPGA/MCU-based prototypes that require an expensive supercapacitor for energy storage, we utilize a cost-effective and widely available multi-layer ceramic capacitor (MLCC) with a capacitance in the  $\mu$ F range. The power consumption of each stage, as well as the achievable number of workflow cycles after energy harvesting, are evaluated in Sec. 5.1.3.

## 5 Evaluation

### 5.1 Micro-benchmarks

Figure 18: Sensitivity of the wake-up receiver and the synchronization receiver.

Figure 19: Delay distributions of the wake-up and the synchronization process.

**5.1.1 Wake-up receiver.** **(1) Sensitivity measurement.** We measure the wake-up signal detection rate under different transmission power levels via a wired connection. The wake-up signal is transmitted by an Agilent E4438C analog signal generator, which precisely controls the output power level. The wake-up receiver is directly connected to the transmitter via a coaxial line. A successful wake-up event is defined as the successful demodulation of all 16 bits in one wake-up signal transmission. Sensitivity is defined as the transmission power level at which the packet detection rate exceeds 99.9%. The resulting packet detection rate is illustrated in Fig. 18a, indicating that our wake-up receiver achieves a sensitivity of -51 dBm. When the input power exceeds -51 dBm, the packet detection rate consistently reaches 100.0%.

**(2) Delay distribution.** We connect both the transmitter RF signal and the tag’s LNA enable flag to an oscilloscope to monitor the wake-up delay, using the end of the wake-up signal as the reference point. Our experiments accumulate the results of  $1 \times 10^4$  receptions. As shown in Fig. 19a, the wake-up delay ranges from 6.2 ms to 6.8 ms, which aligns with our estimations shown in Fig. 10.

**5.1.2 Synchronization receiver.** **(1) Sensitivity measurement.** Similar to the wake-up event measurements, we use an oscilloscope to monitor a debug pin that indicates the backscatter flag, signifying successful synchronization. For this measurement, we bypass the wake-up receiver by keeping the wake-up flag always high. The transmitter is wired directly to the synchronization receiver via a coaxial cable. We control the transmitter’s power level and record the packet detection rate across 1000 consecutive synchronization frames. The results, as shown in Fig. 18b, indicate that the sensitivity, defined at a packet detection rate of 99.9%, is -54.5 dBm.



Figure 20: Backscatter after RF energy harvesting. MLCC capacitors of 76  $\mu$ F and 152  $\mu$ F are used.



Figure 21: Power consumption with time.

**(2) Delay distribution.** To measure the synchronization delay, we monitor both the transmitter and the tag’s backscatter flag using an oscilloscope, with the start of the synchronization frame as the reference point. As illustrated in Fig. 19b, the synchronization delay ranges from 0.02  $\mu$ s to 0.90  $\mu$ s, aligning well with the estimated accuracy of 1  $\mu$ s achieved using a 1 MHz sampling rate in synchronization.

**5.1.3 RF energy harvesting.** **(1) RF energy harvester sensitivity.** We perform wire-connected measurements to accurately obtain the sensitivity. In the wired measurement, we use a coaxial line to connect the analog signal transmitter Agilent E4438C and our tag. We monitor the voltage of the 76  $\mu$ F capacitor and find that -16 dBm is the sensitivity of our RF energy harvester.

**(2) Backscatter with harvested energy.** We now measure the voltage of the energy storage capacitor after harvesting enough energy. We set the end-of-charge voltage to 2 V, which is the voltage at which the backscatter process begins. As shown by the blue curve in Fig. 20, each voltage drop in the voltage-time curve represents one cycle of backscatter communication. The tag takes 10 s to use up the energy stored on the 76  $\mu$ F capacitor, completing 6 cycles of operation. In this process, the voltage drops from 2 V to 1.2 V, the lowest voltage for normal operation. After that, the tag returns to the energy harvesting state. Replacing the capacitor with a larger one yields a longer discharge time: the orange curve in Fig. 20 shows the discharge when using a 152  $\mu$ F capacitor, which sustains 16 cycles of backscatter communication over a duration of 22 s.
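The energy available for these cycles follows from the capacitor energy formula $E = \frac{1}{2} C (V_{\text{start}}^2 - V_{\text{end}}^2)$. The sketch below applies it to both capacitors, assuming that the 152 $\mu$F capacitor is discharged over the same 2 V to 1.2 V window as the 76 $\mu$F one, which the text does not state explicitly. The per-cycle figure is only an average over the whole discharge, so it also includes the energy spent between cycles.

```python
def usable_energy_uj(capacitance_uf, v_start, v_end):
    """Energy released when a capacitor discharges from v_start to v_end, in microjoules."""
    return 0.5 * capacitance_uf * (v_start ** 2 - v_end ** 2)


# (capacitance in uF, number of backscatter cycles reported in the text)
for cap_uf, cycles in ((76, 6), (152, 16)):
    energy = usable_energy_uj(cap_uf, 2.0, 1.2)
    print(f"{cap_uf} uF: {energy:.1f} uJ usable, "
          f"about {energy / cycles:.1f} uJ per cycle on average")
```

Doubling the capacitance doubles the usable energy, so the larger number of cycles observed with the 152 $\mu$F capacitor is consistent in direction with this simple budget.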

### 5.2 Power consumption

Figure 22: Indoor floor plan and experiment setup.

Figure 23: Indoor LoS performance.

Figure 24: Indoor NLoS performance.

**Wake-up receiver.** We first measure the power consumption of the wake-up receiver alone. Our chip does not have direct I/O exposure for measuring the wake-up receiver current ( $I_0$ ). However, the entire chip and the backscatter radio do have LDO I/O exposure. Thus, we measure the current of the whole chip ( $I_1$ ) and the current of the backscatter radio alone ( $I_2$ ). Using Kirchhoff's first law, we calculate the wake-up receiver current as  $I_0 = I_1 - I_2 = 0.726 \mu\text{A} - 0.582 \mu\text{A} = 0.144 \mu\text{A}$ . Additionally, using the LDO debug I/O, we measure the voltage across the wake-up receiver circuit,  $V_0 = 0.4 \text{ V}$ . Summarizing these measurements, the total power consumption of the wake-up receiver is  $P = I_0 \times V_0 = 57.7 \text{ nW}$ . This measurement was conducted using a high-accuracy digital ammeter, the Keysight 34470A, which offers pA-level precision.

**NANO SCATTER tag.** We power the tag at 1.5 V using a bench voltage source and place an ammeter in series with the circuit for accurate current measurement. The transmitter is controlled to send the excitation signal, and the current variation  $I(t)$  is simultaneously recorded. The power consumption is calculated as  $P(t) = I(t) \times 1.5 \text{ V}$ . The results, shown in Fig. 21, indicate that the power consumption is 1.1  $\mu\text{W}$  during idle listening, 61.4  $\mu\text{W}$  during synchronization, and 43.2  $\mu\text{W}$  during backscatter communication.

### 5.3 Indoor performance

Figure 25: Outdoor experiment setup.

Figure 26: Outdoor performance of concurrent tags.

Now we evaluate the indoor performance of the NANO SCATTER system. The experiment setup is depicted in Fig. 22. A tag is randomly selected and placed in the corridor of an office building, with the transmitter positioned at the corridor's end near the staircase. All office rooms are isolated with concrete walls exceeding 20 cm in thickness.

In the line-of-sight (LoS) setup, the receiver is placed 100 m away from the transmitter. For the non-line-of-sight (NLoS) setup, the receiver is placed at three different locations (marked as A, B, and C) within a room.

The indoor LoS results in Fig. 23 demonstrate that the tag can communicate reliably at any position within a Tx-tag distance of 66 m in LoS scenarios, achieving a BER below  $10^{-5}$  and a stable goodput (error-free throughput) around 447 bps. Once the Tx-tag distance exceeds 66 m, the BER suddenly increases because the wake-up receiver fails to demodulate wake-up signals, which in turn causes empty uplink packets.

The NLoS results in Fig. 24 illustrate the BER and goodput under NLoS scenarios. These results indicate that even when the Tx-Rx path includes multiple concrete walls as obstacles, the backscatter signal can still be demodulated with a BER below  $10^{-3}$  and a goodput above 400 bps, with a Tx-tag distance of 14 m to 24 m. Among the NLoS locations, location A outperforms locations B and C, due to the presence of richer reflection paths enhancing signal propagation.

### 5.4 Outdoor Performance

The setup for evaluating outdoor performance is illustrated in Fig. 25. We deployed six tags for concurrent backscatter communication, placing them between the transmitter and the receiver. The transmitter operates at a transmission power of 30 dBm at 433 MHz and is positioned 100 m away from the receiver.

The performance metrics, including BER and goodput, at various locations are presented in Fig. 26. The concurrent tags demonstrate reliable communication at any location within a Tx-tag distance of 100 m. Most tags maintain a BER below  $10^{-3}$ , and the aggregated goodput exceeds 2.1 kbps. The BER and goodput fluctuations observed among the six tags outdoors likely stem from uneven ground conditions, which cause random changes in antenna orientation and variations in antenna gain at different locations.

### 5.5 Comparison with SoTA

Tab. 1 summarizes the performance comparison with SoTA systems supporting concurrent backscatter communication. Since other works lack goodput metrics, throughput values are used where applicable. The tag Rx power reflects downlink power consumption during idle listening and synchronization. Key points include: (a) NANO SCATTER achieves 10  $\mu$ W-level measured power for both uplink and downlink, while others report 100  $\mu$ W-level power or lack measured data. (b) NANO SCATTER supports a maximum Tx-tag distance of 100 m, maintaining communication throughout, unlike other systems constrained by downlink range.

**Table 1: Comparison with other works**

| Performance                  | NANO SCATTER       | P<sup>2</sup>LoRa | DigiScatter | NetScatter |
|------------------------------|--------------------|-------------------|-------------|------------|
| Backscatter power (measured) | 43 $\mu$W          | 320 $\mu$W        | NA          | NA         |
| Tag Rx power (measured)      | 1 $\mu$W/61 $\mu$W | NA                | NA          | NA         |
| Tx-tag distance              | 100 m              | <30 m             | <6 m        | <25 m      |
| Concurrency (exp./limit)     | 6/512              | 101/101           | 300/1019    | 256/256    |
| Synchronization accuracy     | 1 $\mu$s           | >4.5 $\mu$s       | 0.5 $\mu$s  | NA         |
| Goodput (per tag)            | 0.5 kbps           | 0.1 kbps          | 18.8 kbps   | 0.9 kbps   |

## 6 Discussions

**Dynamic range.** Dynamic range is not critical for the Rx due to the 2 MHz frequency shift applied by all tags, which enables filtering out the strong Tx signal and isolating the backscatter signal for reliable reception.

**Power consumption in idle listening.** The wake-up receiver's power consumption is 58 nW, yet the total tag consumption during idle listening is higher at 1.1  $\mu$ W. This discrepancy arises from our use of a conservative LDO to ensure that the chip can be operational after power-on. If a more aggressive design were used, simulations suggest that a 100 nW LDO could suffice, which is our future work.

**Available tags.** Our experiment utilizes six tags, constrained by the need for manual measurement and individual tuning of customized ICs. Automating this process can enable a scalable and efficient deployment of tags, which is already a mature method in the industry.

**Limited data rate.** The data rate of each tag in our system (0.6 kbps) is a deliberate trade-off between SNR under the maximum range and the data rate. It can be increased by expanding the bandwidth  $BW$  as illustrated in Fig. 15.
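The 0.6 kbps figure is consistent with a simple symbol-rate calculation. The sketch below assumes that the sample rate equals the bandwidth $BW$, that each symbol spans $N$ samples plus the 16-sample cyclic prefix, and that BPSK carries one bit per symbol; this is our reconstruction rather than a derivation given in the text.

```python
BW_HZ = 312_500        # selected bandwidth
N_SUBCARRIERS = 512    # samples per symbol (symbol length N)
CP_SAMPLES = 16        # cyclic prefix length
BITS_PER_SYMBOL = 1    # BPSK


def per_tag_data_rate_bps(bw_hz, n, cp, bits_per_symbol=BITS_PER_SYMBOL):
    """One symbol lasts (n + cp) samples at a sample rate equal to bw_hz (assumption)."""
    return bw_hz / (n + cp) * bits_per_symbol


rate = per_tag_data_rate_bps(BW_HZ, N_SUBCARRIERS, CP_SAMPLES)
print(f"per-tag data rate: {rate:.0f} bps")
```

Under the same assumptions, the rate scales linearly with $BW$, which matches the trade-off described above: a wider bandwidth raises the data rate at the cost of SNR.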

## 7 Related Works

**(1) Wake-up receiver.** IC-based wake-up receivers can realize nW-level power while achieving high sensitivity in various bands [31–33]. Our work leverages the nanowatt wake-up receiver to reduce downlink power.

**(2) Backscatter communication.** Backscatter communication is known to be a promising technology for achieving  $\mu$ W-level wireless uplink. Apart from RF backscatter [3–13, 23, 24, 34–39], recent advances in backscatter communication include support for terahertz electromagnetic waves [40] and underwater backscatter using acoustic waves [41, 42]. Most of these works are functionally verified with FPGA/MCU-based prototypes, with modulator power simulated by IC design tools.

Recently, a few studies have explored backscatter communication using real IC implementations. *SyncScatter* explores IC-based Wi-Fi-compatible backscatter previously tested on FPGA/MCU-based prototypes [26]. However, it is limited to the 802.11b protocol and supports only 1-2 coexisting tags, limiting its scalability for A-IoT applications. Additionally, Jeeva Wireless reportedly designs chips based on academic innovations, but the specific performance and technical details are not publicly disclosed [43]. This paper proposes the first backscatter network with complete IC integration, addressing the existing limitations.

**(3) Ultra-low-power receivers.** As backscatter communication performance improves, the low-power downlink becomes a critical bottleneck. Recent studies have explored long-range downlink solutions [23, 24, 44, 45]. However, none of these approaches achieve high sensitivity with power consumption below 100  $\mu$ W, making them unsuitable for A-IoT applications.

## 8 Conclusions

In this paper, we introduce NANO SCATTER, an IC-based concurrent backscatter network system designed to close the gaps that have so far kept backscatter systems from meeting the critical requirements of ambient IoT, including unreliable power estimation, an underexplored downlink, and the synchronization challenge for high concurrency. Our design provides a concurrency capacity of 512, and evaluation results show that the wake-up receiver consumes 58 nW during idle listening and the tag consumes 43  $\mu$ W while communicating, with a downlink sensitivity of -51 dBm. Tags can communicate concurrently at arbitrary points between the Tx and Rx with a Tx-Rx distance up to 100 m. NANO SCATTER tags can also communicate in a battery-free manner with RF energy harvesting.

## Acknowledgements

The work in this paper is supported by the National Key Research and Development Program of China 2020YFB1708700, and National Natural Science Foundation of China (No. 61922055, 61872233, 62272293).

## References

[1] M. M. Butt, N. R. Mangalvedhe, N. K. Pratas, J. Harrebek, J. Kimionis, M. Tayyab, O.-E. Barbu, R. Ratasuk, and B. Vejlgaard, “Ambient iot: A missing link in 3gpp iot devices landscape,” *IEEE Internet of Things Magazine*, vol. 7, no. 2, pp. 85–92, 2024.

[2] 3rd Generation Partnership Project (3GPP), “Study on ambient power-enabled internet of things (iot),” Technical Report 3GPP TR 22.840 V2.2.0, Technical Specification Group Services and System Aspects, December 2023. Release 19.

[3] J. Zhao, W. Gong, and J. Liu, “X-tandem: Towards multi-hop backscatter communication with commodity wifi,” in *Proceedings of the 24th Annual International Conference on Mobile Computing and Networking*, MobiCom ’18, (New York, NY, USA), p. 497–511, Association for Computing Machinery, 2018.

[4] R. Zhao, F. Zhu, Y. Feng, S. Peng, X. Tian, H. Yu, and X. Wang, “OFDMA-enabled wi-fi backscatter,” in *The 25th Annual International Conference on Mobile Computing and Networking (MobiCom 19)*, pp. 1–15, 2019.

[5] A. Varshney, A. Soleiman, and T. Voigt, “Tunnelscatter: Low power communication for sensor tags using tunnel diodes,” in *The 25th Annual International Conference on Mobile Computing and Networking*, MobiCom ’19, (New York, NY, USA), Association for Computing Machinery, 2019.

[6] M. Rostami, K. Sundaresan, E. Chai, S. Rangarajan, and D. Ganesan, “Redefining passive in backscattering with commodity devices,” in *Proceedings of the 26th Annual International Conference on Mobile Computing and Networking*, MobiCom ’20, (New York, NY, USA), Association for Computing Machinery, 2020.

[7] S. Li, C. Zhang, Y. Song, H. Zheng, L. Liu, L. Lu, and M. Li, “Internet-of-microchips: direct radio-to-bus communication with spi backscatter,” in *Proceedings of the 26th Annual International Conference on Mobile Computing and Networking*, MobiCom ’20, (New York, NY, USA), Association for Computing Machinery, 2020.

[8] J. Jiang, Z. Xu, F. Dang, and J. Wang, “Long-range ambient lora backscatter with parallel decoding,” MobiCom ’21, (New York, NY, USA), p. 684–696, Association for Computing Machinery, 2021.

[9] F. Dehbashi, A. Abedi, T. Brecht, and O. Abari, “Verification: can wifi backscatter replace rfid?,” in *Proceedings of the 27th Annual International Conference on Mobile Computing and Networking*, MobiCom ’21, (New York, NY, USA), p. 97–107, Association for Computing Machinery, 2021.

[10] X. Guo, Y. He, Z. Yu, J. Zhang, Y. Liu, and L. Shangguan, “Rf-transformer: a unified backscatter radio hardware abstraction,” MobiCom ’22, (New York, NY, USA), p. 446–458, Association for Computing Machinery, 2022.

[11] R. Menon, R. Gujarathi, A. Saffari, and J. R. Smith, “Wireless identification and sensing platform version 6.0,” in *Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems*, SenSys ’22, (New York, NY, USA), p. 899–905, Association for Computing Machinery, 2023.

[12] H. Dong, Y. Xie, X. Zhang, W. Wang, X. Zhang, and J. He, *GPSMirror: Expanding Accurate GPS Positioning to Shadowed and Indoor Regions with Backscatter*. New York, NY, USA: Association for Computing Machinery, 2023.

[13] K. Qian, L. Yao, K. Zheng, X. Zhang, and T. N. Ng, *UniScatter: a Metamaterial Backscatter Tag for Wideband Joint Communication and Radar Sensing*. New York, NY, USA: Association for Computing Machinery, 2023.

[14] C. Du, J. Yu, R. Zhang, J. Ren, and J. An, “Orthcatter: High-throughput in-band OFDM backscatter with over-the-air code division,” in *21st USENIX Symposium on Networked Systems Design and Implementation*, NSDI 2024, Santa Clara, CA, April 15-17, 2024 (L. Vanbever and I. Zhang, eds.), USENIX Association, 2024.

[15] A. Gupta, D. Park, S. Bashar, C. Girerd, T. K. Morimoto, and D. Bharadia, “Wiforcesticker: Batteryless, thin sticker-like flexible force sensor,” *CoRR*, vol. abs/2209.09217, 2022.

[16] M. Hessar, A. Najafi, and S. Gollakota, “Netscatter: Enabling large-scale backscatter networks,” in *Proceedings of the 16th USENIX Conference on Networked Systems Design and Implementation (NSDI 19)*, pp. 271–283, 2019.

[17] M. Xie, M. Jin, F. Zhu, Y. Zhang, X. Tian, X. Wang, and C. Zhou, “Enabling high-rate backscatter sensing at scale,” *ACM MobiCom ’24*, (New York, NY, USA), p. 124–138, Association for Computing Machinery, 2024.

[18] P.-H. P. Wang, C. Zhang, H. Yang, M. Dunna, D. Bharadia, and P. P. Mercier, “A low-power backscatter modulation system communicating across tens of meters with standards-compliant wi-fi transceivers,” *IEEE Journal of Solid-State Circuits*, vol. 55, no. 11, pp. 2959–2969, 2020.

[19] S.-K. Kuo, M. Dunna, H. Lu, A. Agarwal, D. Bharadia, and P. P. Mercier, “21.5 an lte-harvesting ble-to-wifi backscattering chip for single-device rfid-like interrogation,” in *2023 IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 320–322, IEEE, 2023.

[20] B. Kellogg, V. Talla, S. Gollakota, and J. R. Smith, “Passive wi-fi: Bringing low power to wi-fi transmissions,” in *13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16)*, pp. 151–164, 2016.

[21] P. Zhang, D. Bharadia, K. Joshi, and S. Katti, “Hitchhike: Practical backscatter using commodity wifi,” in *Proceedings of the 14th ACM Conference on Embedded Network Sensor Systems (SenSys 16)*, pp. 259–271, 2016.

[22] P. Zhang, C. Josephson, D. Bharadia, and S. Katti, “Freerider: Backscatter communication using commodity radios,” in *Proceedings of the 13th International Conference on emerging Networking EXperiments and Technologies (CoNEXT 17)*, pp. 389–401, 2017.

[23] S. Li, H. Zheng, C. Zhang, Y. Song, S. Yang, M. Chen, L. Lu, and M. Li, “Passive DSSS: empowering the downlink communication for backscatter systems,” in *19th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2022, Renton, WA, USA, April 4-6, 2022* (A. Phanishayee and V. Sekar, eds.), pp. 913–928, USENIX Association, 2022.

[24] Y. Song, L. Lu, J. Wang, C. Zhang, H. Zheng, S. Yang, J. Han, and J. Li, “μmote: Enabling passive chirp de-spreading and μw-level long-range downlink for backscatter devices,” in *20th USENIX Symposium on Networked Systems Design and Implementation*, NSDI 2023, Boston, MA, April 17-19, 2023 (M. Balakrishnan and M. Ghobadi, eds.), pp. 1751–1766, USENIX Association, 2023.

[25] F. Zhu, Y. Feng, Q. Li, X. Tian, and X. Wang, “Digiscatter: Efficiently prototyping large-scale ofdma backscatter networks,” in *Proceedings of the 18th International Conference on Mobile Systems, Applications, and Services (MobiSys 20)*, pp. 42–53, 2020.

[26] M. Dunna, M. Meng, P. Wang, C. Zhang, P. P. Mercier, and D. Bharadia, “Syncscatter: Enabling wifi like synchronization and range for wifi backscatter communication,” in *18th USENIX Symposium on Networked Systems Design and Implementation*, NSDI 2021, April 12-14, 2021 (J. Mickens and R. Teixeira, eds.), pp. 923–937, USENIX Association, 2021.

[27] H. Friis, “A note on a simple transmission formula,” *Proceedings of the IRE*, vol. 34, no. 5, pp. 254–256, 1946.

[28] X. Liu, Z. Chi, W. Wang, Y. Yao, P. Hao, and T. Zhu, “Verification and redesign of {OFDM} backscatter,” in *18th USENIX symposium on networked systems design and implementation (NSDI 21)*, pp. 939–953, 2021.

[29] Z. Chi, X. Liu, W. Wang, Y. Yao, and T. Zhu, “Leveraging ambient LTE traffic for ubiquitous passive communication,” in *Proceedings of the*

*Annual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication (SIGCOMM 20)*, pp. 172–185, 2020.

[30] F. Zhu, M. Ouyang, L. Feng, Y. Liu, X. Tian, M. Jin, D. Chen, and X. Wang, “Enabling software-defined phy for backscatter networks,” in *Proceedings of the 20th Annual International Conference on Mobile Systems, Applications and Services*, MobiSys ’22, (New York, NY, USA), p. 330–342, Association for Computing Machinery, 2022.

[31] H. Jiang, P.-H. P. Wang, L. Gao, P. Sen, Y.-H. Kim, G. M. Rebeiz, D. A. Hall, and P. P. Mercier, “24.5 a 4.5 nw wake-up radio with- 69dbm sensitivity,” in *2017 IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 416–417, IEEE, 2017.

[32] J. Moody, P. Bassirian, A. Roy, N. Liu, S. Pancrazio, N. S. Barker, B. H. Calhoun, and S. M. Bowers, “A- 76dbm 7.4 nw wakeup radio with automatic offset compensation,” in *2018 IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 452–454, IEEE, 2018.

[33] Z. Yang, J. Yin, W.-H. Yu, H. Zhang, P.-I. Mak, and R. P. Martins, “A ulp long-range active-rf tag with automatic antenna-interface calibration achieving 20.5% tx efficiency at-22dbm eirp, and-60.4 dbm sensitivity at 17.8 nw rx power,” in *2023 IEEE International Solid-State Circuits Conference (ISSCC)*, pp. 30–32, IEEE, 2023.

[34] M. Jin, Y. He, X. Meng, Y. Zheng, D. Fang, and X. Chen, “Fliptracer: Practical parallel decoding for backscatter communication,” in *Proceedings of the 23rd Annual International Conference on Mobile Computing and Networking*, MobiCom ’17, (New York, NY, USA), p. 275–287, Association for Computing Machinery, 2017.

[35] X. Guo, L. Shangguan, Y. He, N. Jing, J. Zhang, H. Jiang, and Y. Liu, “Saiyan: Design and implementation of a low-power demodulator for lora backscatter systems,” in *19th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2022, Renton, WA, USA, April 4-6, 2022* (A. Phanishayee and V. Sekar, eds.), pp. 437–451, USENIX Association, 2022.

[36] H. Lu, M. Mazaheri, R. Rezvani, and O. Abari, “A millimeter wave backscatter network for two-way communication and localization,” in *Proceedings of the ACM SIGCOMM 2023 Conference*, pp. 49–61, 2023.

[37] K. M. Bae, N. Ahn, Y. Chae, P. Pathak, S.-M. Sohn, and S. M. Kim, “Omniscatter: extreme sensitivity mmwave backscattering using commodity fmcw radar,” in *Proceedings of the 20th Annual International Conference on Mobile Systems, Applications and Services*, pp. 316–329, 2022.

[38] E. Soltanaghaei, A. Prabhakara, A. Balanuta, M. Anderson, J. M. Rabaey, S. Kumar, and A. Rowe, “Millimetro: mmwave retro-reflective tags for accurate, long range localization,” in *Proceedings of the 27th Annual International Conference on Mobile Computing and Networking*, pp. 69–82, 2021.

[39] K. M. Bae, H. Moon, S.-M. Sohn, and S. M. Kim, “Hawkeye: Hectometer-range subcentimeter localization for large-scale mmwave backscatter,” in *Proceedings of the 21st Annual International Conference on Mobile Systems, Applications and Services*, pp. 303–316, 2023.

[40] A. Kludze and Y. Ghasempour, “{LeakyScatter}: A {Frequency-Agile} directional backscatter network above 100 {GHz},” in *20th USENIX Symposium on Networked Systems Design and Implementation (NSDI 23)*, pp. 375–388, 2023.

[41] J. Jang and F. Adib, “Underwater backscatter networking,” in *Proceedings of the ACM special interest group on data communication*, pp. 187–199, 2019.

[42] A. Eid, J. Rademacher, W. Akbar, P. Wang, A. Allam, and F. Adib, “Enabling long-range underwater backscatter via van atta acoustic networks,” in *Proceedings of the ACM SIGCOMM 2023 Conference*, pp. 1–19, 2023.

[43] MIT Startup Exchange, “Jeeva wireless: A low-power backscatter communication chip.” <https://startupechange.mit.edu/startup-features/jeeva-wireless>, 2024. Accessed: 2024-08-25.

[44] M. Rostami, X. Chen, Y. Feng, K. Sundaresan, and D. Ganesan, “Mixiq: re-thinking ultra-low power receiver design for next-generation on-body applications,” in *Proceedings of the 27th Annual International Conference on Mobile Computing and Networking*, MobiCom ’21, (New York, NY, USA), p. 364–377, Association for Computing Machinery, 2021.

[45] X. Guo, L. Shangguan, Y. He, N. Jing, J. Zhang, H. Jiang, and Y. Liu, “Saiyan: Design and implementation of a low-power demodulator for {LoRa} backscatter systems,” in *19th USENIX Symposium on Networked Systems Design and Implementation (NSDI 22)*, pp. 437–451, 2022.